Abstract Given any finite and closed chemical reaction system, it is possible
to efficiently determine whether or not it contains a ‘self-sustaining and
collectively autocatalytic’ subset of reactions, and to find such subsets when
they exist. However, for systems that are potentially open-ended (for example,
when no prescribed upper bound is placed on the complexity or size/length
of molecule types), the theory developed for the finite case breaks down. We
investigate a number of subtleties that arise in such systems that are absent
in the finite setting, and present several new results.
1 Introduction

Consider any system of chemical reactions, in which certain molecule types 
catalyse reactions and where there is a pool of simple molecule types available 
from the environment (a ‘food source’). One can then ask whether, within 
this system, there is a subset of reactions that is both self-sustaining (each 
molecule can be constructed starting just from the food source) and collectively 
autocatalytic (every reaction is catalysed by some molecule produced by the 
system or present in the food set). This notion of ‘self-sustaining and
collectively autocatalytic’ needs to be carefully formalised (we do so below),
and is relevant to some basic questions such as how biochemical metabolism
began at the origin of life. A simple mathematical framework
for formalising and studying such self-sustaining autocatalytic networks has
been developed: so-called ‘RAF (Reflexively-autocatalytic and F-generated)
theory’. This theory includes an algorithm to determine whether such networks
exist within a larger system, and for classifying these networks; moreover, the
theory allows us to calculate the probability of the formation of such systems 
within networks based on the ligation and cleavage of polymers, and a random 
pattern of catalysis. 

However, this theory relies heavily on the system being closed and finite. 
In certain settings, it is useful to consider polymers of arbitrary length being 
formed (e.g. in generating the membrane for a protocell). In these and other
unbounded chemical systems, interesting complications arise for RAF theory, 
particularly where the catalysis of certain reactions is possible only by molecule 
types that are of greater complexity/length than the reactants or product of 
the reactions in question. In this paper, we extend earlier RAF theory to deal 
with unbounded chemical reaction systems. As in some of our earlier work, 
our analysis ignores the dynamical aspects, which are dealt with in other 
frameworks, such as ‘chemical organisation theory’ [I]; here we concentrate 
instead on just the pattern of catalysis and the availability of reactants. 


1.1 Preliminaries and definitions 

In this paper, a chemical reaction system (CRS) consists of (i) a set $X$ of
molecule types, (ii) a set $\mathcal{R}$ of reactions, (iii) a pattern of catalysis $C$ that
describes which molecule(s) catalyse which reactions, and (iv) a distinguished
subset $F$ of $X$ called the food set.

We will denote a CRS as a quadruple $Q = (X, \mathcal{R}, C, F)$, and encode the
pattern of catalysis $C$ by specifying a subset of $X \times \mathcal{R}$ so that $(x, r) \in C$
precisely if molecule type $x$ catalyses reaction $r$. See Fig. 1 for a simple example
(from [HI]).

In certain applications, $X$ often consists of (or at least contains) a set of
polymers (sequences) over some finite alphabet $A$ (i.e. chains $x_1 x_2 \cdots x_r$, $r \geq 1$,
where $x_i \in A$), as in Fig. 1; such polymer systems are particularly relevant
to RNA or amino-acid sequence models of early life. Reactions involving such
polymers typically involve cleavage and ligation (i.e. cutting and/or joining
polymers), or adding or deleting a letter to an existing chain. Notice that if
no bound is put on the maximal length of the polymers, then both $X$ and $\mathcal{R}$
are infinite for such networks, even when $|A| = 1$.

In this paper we do not necessarily assume that X consists of polymers, or 
that the reactions are of any particular type. Thus, a reaction can be viewed 
formally as an ordered pair (A, B ) consisting of a multi-set A of elements from 
X (the reactants of r) and a multi-set B of elements of X (the products of 
r); but we will mostly use the equivalent and more conventional notation of 
writing a reaction in the form: 

$$r = (a_1 + a_2 + \cdots + a_k \to b_1 + b_2 + \cdots + b_l),$$

where the $a_i$'s (reactants of $r$) and $b_j$'s (products of $r$) are elements of $X$, and
$k, l \geq 1$ (e.g. $x \to y$, $x + x \to y$ and $y \to x + x'$ are reactions).

Fig. 1 A simple CRS based on polymers over a two-letter alphabet $\{0, 1\}$, with a food set
$F = \{0, 1, 00, 11\}$ and seven reactions. Dashed arrows indicate catalysis; solid arrows show
reactants entering a reaction and products leaving. In this CRS there are exactly four RAFs
(defined below), namely $\{r_1, r_2\}$, $\{r_3\}$, $\{r_1, r_2, r_3\}$, and $\{r_1, r_2, r_3, r_5\}$.


In this paper, we extend our earlier analysis of RAFs to the general (finite 
or infinite) case and find that certain subtleties arise that are absent in the 
finite case. We will mostly assume the following conditions (A1) and (A2), and
sometimes also (A3).

(A1) $F$ is finite;

(A2) each reaction $r \in \mathcal{R}$ has a finite set of reactants, denoted $\rho(r)$, and a finite
set of products, denoted $\pi(r)$;

(A3) for any given finite set $Y$ of molecule types, there are only finitely many
reactions $r$ with $\rho(r) = Y$.

Given a subset $\mathcal{R}'$ of $\mathcal{R}$, we say that a subset $W \subseteq X$ of molecule types is closed
relative to $\mathcal{R}'$ if $W$ satisfies the property: $r \in \mathcal{R}'$ and $\rho(r) \subseteq W \implies \pi(r) \subseteq W$.
In other words, a set of molecule types is closed relative to $\mathcal{R}'$ if every molecule
that can be produced from $W$ using reactions in $\mathcal{R}'$ is already present in $W$.
Notice that the full set $X$ is itself closed. The global closure of $F$ relative to
$\mathcal{R}'$, denoted here as $\mathrm{gcl}_{\mathcal{R}'}(F)$, is the intersection of all closed sets that contain
$F$ (since $X$ is closed, this intersection is well defined). Thus $\mathrm{gcl}_{\mathcal{R}'}(F)$ is the
unique minimal set of molecule types containing $F$ that is closed relative to
$\mathcal{R}'$.

We can also consider a constructive closure of $F$ relative to $\mathcal{R}'$, denoted
here as $\mathrm{ccl}_{\mathcal{R}'}(F)$, which is the union of the set $F$ and the set of molecule types $x$
that can be obtained from $F$ by carrying out any finite sequence of reactions
from $\mathcal{R}'$ where, for each reaction $r$ in the sequence, each reactant of $r$ is either
an element of $F$ or a product of a reaction occurring earlier in the sequence,
and $x$ is a product of the last reaction in the sequence.
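
For a finite reaction set, the constructive closure can be computed by repeatedly firing any reaction whose reactants are all available. The following Python fragment is a minimal sketch under stated assumptions: reactant and product multi-sets are flattened to sets, and the function name `closure`, the dictionary layout and the three example reactions are illustrative choices, not part of the theory above.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Assumed encoding: a reaction is (reactants, products); names are dict keys.
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def closure(food: Iterable[str], reactions: Dict[str, Reaction]) -> Set[str]:
    """Constructive closure of `food` under a finite set of reactions."""
    available = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            # A reaction can fire once every reactant is available.
            if reactants <= available and not products <= available:
                available |= products
                changed = True
    return available


# Hypothetical example over the food set {0, 1}.
reactions = {
    "r1": (frozenset({"0", "1"}), frozenset({"01"})),
    "r2": (frozenset({"01", "1"}), frozenset({"011"})),
    "r3": (frozenset({"10", "0"}), frozenset({"100"})),  # "10" is never produced
}
print(sorted(closure({"0", "1"}, reactions)))
```

In this example the reaction `r3` never fires, because its reactant `10` cannot be built from the food set.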

Note that $\mathrm{gcl}_{\mathcal{R}'}(F)$ always contains $\mathrm{ccl}_{\mathcal{R}'}(F)$ (and these two sets coincide
when the CRS is finite) but, for an infinite CRS, $\mathrm{ccl}_{\mathcal{R}'}(F)$ can be a strict
subset of $\mathrm{gcl}_{\mathcal{R}'}(F)$, even when (A1) holds. To see this, consider the system
$(X, \mathcal{R}')$ where $X = \{f, x_0, x_1, x_2, \ldots\}$, $F = \{f\}$, and where $\mathcal{R}' = \{r_0, r_1, r_2, r_3, \ldots\}$ is
defined as follows:

$$r_1 = (f \to x_1);$$

$$r_{j+1} = (f + x_j \to x_{j+1}), \text{ for all } j \geq 1;$$

$$r_0 = (x_1 + x_2 + \cdots \to x_0).$$

Then $x_0 \in \mathrm{gcl}_{\mathcal{R}'}(F) - \mathrm{ccl}_{\mathcal{R}'}(F)$. In this example, notice that $r_0$ has infinitely
many reactants, which violates (A2). By contrast, when (A2) holds, we have
the following result.

Lemma 1 Suppose that (A2) holds. Then $\mathrm{ccl}_{\mathcal{R}'}(F) = \mathrm{gcl}_{\mathcal{R}'}(F)$. Moreover,
under (A1) and (A2), if $\mathcal{R}'$ is countable, then this (common) closure of $F$
relative to $\mathcal{R}'$ is countable also.

Proof Suppose the condition of Lemma 1 holds but that $\mathrm{ccl}_{\mathcal{R}'}(F)$ is not closed;
we will derive a contradiction. Lack of closure means there is a molecule $x$
in $X - \mathrm{ccl}_{\mathcal{R}'}(F)$ which is the product of some reaction $r \in \mathcal{R}'$ that has all
its reactants in $\mathrm{ccl}_{\mathcal{R}'}(F)$. By (A2), the set of reactants of $r$ is finite, so we
may list them as $x_1, x_2, \ldots, x_k$, and, by the definition of $\mathrm{ccl}_{\mathcal{R}'}(F)$, for each
$i \in \{1, \ldots, k\}$, either $x_i \in F$ or there is a finite sequence $S_i$ of reactions
from $\mathcal{R}'$ that generates $x_i$ starting from reactants entirely in $F$ and using just
elements of $F$ or products of reactions appearing earlier in the sequence $S_i$. By
concatenating these sequences (in any order) and appending $r$ at the end, we
obtain a finite sequence of reactions that generates $x$ from $F$, so that
$x \in \mathrm{ccl}_{\mathcal{R}'}(F)$, which contradicts the choice of $x$. It follows that $\mathrm{ccl}_{\mathcal{R}'}(F)$ is closed
relative to $\mathcal{R}'$, and since it is clearly a minimal set containing $F$ that is closed
relative to $\mathcal{R}'$, it follows that $\mathrm{ccl}_{\mathcal{R}'}(F) = \mathrm{gcl}_{\mathcal{R}'}(F)$. That $\mathrm{ccl}_{\mathcal{R}'}(F)$ is countable
under (A1) and (A2) when $\mathcal{R}'$ is countable follows from the fact that any countable
union of finite sets is countable. □

In view of Lemma 1, whenever (A2) holds, we will henceforth denote the
(common) closure of $F$ relative to $\mathcal{R}'$ as $\mathrm{cl}_{\mathcal{R}'}(F)$.

Definition [RAF, and related concepts] Suppose we have a CRS $Q =
(X, \mathcal{R}, C, F)$, which satisfies condition (A2). An RAF for $Q$ is a non-empty
subset $\mathcal{R}'$ of $\mathcal{R}$ for which

(i) for each $r \in \mathcal{R}'$, $\rho(r) \subseteq \mathrm{cl}_{\mathcal{R}'}(F)$; and

(ii) for each $r \in \mathcal{R}'$, at least one molecule type in $\mathrm{cl}_{\mathcal{R}'}(F)$ catalyses $r$.

In words, a non-empty set $\mathcal{R}'$ of reactions forms an RAF for $Q$ if, for every
reaction $r$ in $\mathcal{R}'$, each reactant of $r$ and at least one catalyst of $r$ is either
present in $F$ or able to be constructed from $F$ by using just reactions from
within the set $\mathcal{R}'$.
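
Conditions (i) and (ii) can be checked directly for a finite candidate set of reactions. The following is a minimal sketch, assuming reactions are stored as (reactants, products) pairs of sets and catalysis as a set of (molecule, reaction name) pairs; the names `closure` and `is_raf` and the two-reaction example are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Reaction = Tuple[FrozenSet[str], FrozenSet[str]]  # (reactants, products)


def closure(food: Iterable[str], reactions: Dict[str, Reaction]) -> Set[str]:
    """Closure of `food` under a finite set of reactions."""
    available = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= available and not products <= available:
                available |= products
                changed = True
    return available


def is_raf(subset: Set[str], reactions: Dict[str, Reaction],
           catalysis: Set[Tuple[str, str]], food: Set[str]) -> bool:
    """Check conditions (i) and (ii) for the reactions named in `subset`."""
    if not subset:
        return False  # an RAF is non-empty by definition
    sub = {name: reactions[name] for name in subset}
    cl = closure(food, sub)
    for name, (reactants, _products) in sub.items():
        if not reactants <= cl:                              # condition (i)
            return False
        if not any((x, name) in catalysis for x in cl):      # condition (ii)
            return False
    return True


# Hypothetical CRS: r1 needs a catalyst that only r2 can produce.
reactions = {
    "r1": (frozenset({"a", "b"}), frozenset({"ab"})),
    "r2": (frozenset({"ab", "b"}), frozenset({"abb"})),
}
catalysis = {("abb", "r1"), ("a", "r2")}
food = {"a", "b"}

print(is_raf({"r1", "r2"}, reactions, catalysis, food))  # both conditions hold
print(is_raf({"r1"}, reactions, catalysis, food))        # catalyst abb missing
print(is_raf({"r2"}, reactions, catalysis, food))        # reactant ab missing
```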

An RAF $\mathcal{R}'$ for $Q$ is said to be a finite RAF or an infinite RAF depending on
whether $|\mathcal{R}'|$ is finite or infinite. The concept of an RAF is a formalisation
of a ‘collectively autocatalytic set’, pioneered by Stuart Kauffman [5] and
[5j. Since the union of any collection of RAFs is also an RAF, any CRS that 
contains an RAF necessarily contains a unique maximal RAF. An irrRAF is 
an (infinite or finite) RAF that is minimal - i.e. it contains no RAF as a strict 
subset. In contrast to the uniqueness of the maximal RAF, a finite CRS can 
have exponentially many irrRAFs [7]. 

The RAF concept needs to be distinguished from the stronger notion of 
a constructively autocatalytic and F-generated (CAF) set [TO] which requires 
that $\mathcal{R}'$ can be ordered $r_1, r_2, \ldots, r_N$ so that all the reactants and at least
one catalyst of $r_i$ are present in $\mathrm{cl}_{\{r_1, \ldots, r_{i-1}\}}(F)$ for all $i \in \{1, \ldots, N\}$ (in the
initial case where $i = 1$, we take $\mathrm{cl}_{\emptyset}(F) = F$). This condition essentially means
that in a CAF, a reaction can only proceed if one of its catalysts is already 
available, whereas an RAF could become established by allowing one or more 
reactions r to proceed uncatalysed (presumably at a much slower rate) so that 
later, in some chain of reactions, a catalyst for r is generated, allowing the 
whole system to ‘speed up’. Notice that although the CRS in Fig. 1 has four
RAFs, it has no CAF.
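
For a finite reaction set, the CAF ordering can be searched for greedily: fire any reaction whose reactants and at least one catalyst are already available, and repeat until nothing more can fire. The sketch below assumes the same illustrative encoding as before (sets of reactants and products, catalysis as (molecule, reaction name) pairs); `is_caf` and the example data are hypothetical names, not taken from the cited work.

```python
from typing import Dict, FrozenSet, Set, Tuple

Reaction = Tuple[FrozenSet[str], FrozenSet[str]]  # (reactants, products)


def is_caf(subset: Set[str], reactions: Dict[str, Reaction],
           catalysis: Set[Tuple[str, str]], food: Set[str]) -> bool:
    """Greedy test: can the reactions be ordered so that each one has all
    reactants and a catalyst available from the food set and earlier ones?"""
    pending = set(subset)
    if not pending:
        return False
    available = set(food)
    progressed = True
    while pending and progressed:
        progressed = False
        for name in sorted(pending):
            reactants, products = reactions[name]
            catalysed = any((x, name) in catalysis for x in available)
            if reactants <= available and catalysed:
                available |= products
                pending.remove(name)
                progressed = True
    return not pending


reactions = {
    "r1": (frozenset({"a", "b"}), frozenset({"ab"})),
    "r2": (frozenset({"ab", "b"}), frozenset({"abb"})),
}
food = {"a", "b"}

# r1 is catalysed only by abb, which appears after r1 has fired: no CAF.
late_catalyst = {("abb", "r1"), ("a", "r2")}
# Every catalyst is available before the reaction it catalyses: a CAF.
early_catalyst = {("a", "r1"), ("ab", "r2")}

print(is_caf({"r1", "r2"}, reactions, late_catalyst, food))
print(is_caf({"r1", "r2"}, reactions, early_catalyst, food))
```

Because the set of available molecules only grows, the greedy search finds a valid ordering whenever one exists.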



Fig. 2 Examples of a finite RAF (that is not a CAF), a finite CAF and a finite pseudo-RAF
(that is not an RAF). In these examples, the molecule types are round nodes (the food set
is denoted $f_1, f_2, \ldots$, and $p_1, p_2, \ldots$ are products), reactions are hollow squares, and dashed
arrows indicate catalysis.

The RAF concept also needs to be distinguished from the weaker notion of
a pseudo-RAF [12], which replaces condition (ii) with the relaxed condition:

(ii)': for all $r \in \mathcal{R}'$, there exists $x \in F$ or $x \in \pi(r')$ for some $r' \in \mathcal{R}'$ such
that $(x, r) \in C$.

In other words, a pseudo-RAF that fails to be an RAF is an autocatalytic 
system that could continue to persist once it exists, but it can never form 
from just the food set F, since it is not F-generated. 

These two alternative notions to RAFs are illustrated (in the finite setting)
in Fig. 2. Notice that every CAF is an RAF and every RAF is a pseudo-RAF,
but these containments are strict, as Fig. 2 shows.

While the notion of a CAF may seem reasonable, it is arguably too conservative
in comparison to an RAF, since a reaction can still proceed if no
catalyst is present, albeit at a much slower rate, allowing the required catalyst
to eventually be produced. However, relaxing the RAF definition further
to a pseudo-RAF is problematic (since a reaction cannot proceed at all unless
all its reactants are present, and so such a system cannot arise spontaneously
just from $F$). This, along with other desirable properties of RAFs (their formation
requires only low levels of catalysis, in contrast to CAFs), suggests
that RAFs are a reasonable candidate for capturing the minimal necessary
condition for self-sustaining autocatalysis, particularly in models of the origin
of metabolism.


1.2 Properties of RAFs in an infinite CRS 

As in the finite CRS setting, the union of all RAFs is an RAF, so any CRS 
that contains an RAF has a unique maximal one. It is easily seen that an 
infinite CRS that contains an RAF need not have a maximal finite RAF, even 
under (A1)-(A3), but in this case, the CRS would necessarily also contain an
infinite RAF (the union of all the finite RAFs).

A natural question is the following: if an infinite CRS contains an infinite
RAF, does it also contain a finite one? It is easily seen that even under conditions
(A1) and (A2), the answer to this last question is ‘no’. We provide three
examples to illustrate different ways in which this can occur. This is in contrast 
to CAFs, for which exactly the opposite holds: if a CRS contains an infinite 
CAF, then it necessarily contains a sequence of finite ones. Moreover, two of 
the infinite RAFs in the following example contain no irrRAFs (in contrast to 
the finite case, where every RAF contains at least one irrRAF). 

Example 1: Let $X = \{f, x_1, \ldots, x_n, \ldots\}$, $F = \{f\}$ and $\mathcal{R} = \{r_1, r_2, \ldots, r_n, \ldots\}$.
Let $r_1 = (f \to x_1)$. We will specify particular CRSs by describing $r_2, r_3, \ldots$,
and the pattern of catalysis as follows.

- $Q_1$ has a reaction $r_i = (f + x_{i-1} \to x_i)$ for each $i > 1$, and $r_i$ is catalysed
by $x_{i+1}$ for each $i \geq 1$.

- $Q_2$ has a reaction $r_i = (f + f + \cdots + f \text{ [$i$ times]} \to x_i)$ for each $i > 1$, and
$r_i$ is catalysed by $x_{i+1}$ for each $i \geq 1$.

- $Q_3$ has the same reactions as $Q_2$ but $r_i$ is now catalysed by every $x_j$ with $j > i$.

Fig. 3 illustrates the three CRSs.



Each of $Q_1$, $Q_2$, $Q_3$ satisfies (A1) and (A2), but only $Q_1$ satisfies (A3). All
three CRSs contain infinite RAFs, but no finite RAF, and no CAF. More
precisely:

- $Q_1$ has $\mathcal{R}$ as its unique RAF (which is therefore an irrRAF).

- The RAFs of $Q_2$ are precisely the sets of the form $\{r_j, r_{j+1}, \ldots\}$ for some
$j \geq 1$. Thus $Q_2$ has a countably infinite number of RAFs but no irrRAF.

- The RAFs of $Q_3$ consist precisely of all infinite subsets of $\mathcal{R}$. Thus, the set
of RAFs for $Q_3$ is uncountably infinite, and it contains no irrRAF.


2 Determining whether or not a CRS contains an RAF 

In this section, we assume that both (A1) and (A2) hold. Given a CRS $Q =
(X, \mathcal{R}, C, F)$, consider the following nested decreasing sequence of sets of reactions
$\mathcal{R}_1, \mathcal{R}_2, \ldots$, defined by $\mathcal{R}_1 = \mathcal{R}$ and for each $i \geq 1$:

$$\mathcal{R}_{i+1} = \{r \in \mathcal{R}_i : \rho(r) \subseteq \mathrm{cl}_{\mathcal{R}_i}(F), \text{ and } \exists x \in \mathrm{cl}_{\mathcal{R}_i}(F) : (x, r) \in C\}. \quad (1)$$

Thus, $\mathcal{R}_{i+1}$ is obtained from $\mathcal{R}_i$ by removing any reaction that fails to have
either all its reactants or at least one catalyst in the closure of $F$ relative to
$\mathcal{R}_i$. Let $\mu(Q) = \bigcap_{i \geq 1} \mathcal{R}_i$. It is easily shown that any RAF $\mathcal{R}'$ present in $Q$ is
necessarily a subset of $\mu(Q)$ (since $\mathcal{R}' \subseteq \mathcal{R}_i$ for all $i \geq 1$ by induction on $i$).
Thus if $\mu(Q) = \emptyset$ then $Q$ does not have an RAF. In the finite case there is
a strong converse: if $\mu(Q) \neq \emptyset$ then $Q$ has an RAF, and $\mu(Q)$ is the unique
maximal RAF for Q (this is the basis for the ‘RAF algorithm’ [5] and 0 ). 
However, in contrast, this result can fail for an infinite CRS, as we now show 
with a simple example, which also satisfies (A1)-(A3). 
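
In the finite case, the sequence in (1) stabilises after finitely many steps, which gives a direct way to compute the maximal RAF. The following is a minimal sketch of that iteration, not a reproduction of any published implementation; the function names and the four-reaction example (with a catalyst `z` that nothing produces) are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Reaction = Tuple[FrozenSet[str], FrozenSet[str]]  # (reactants, products)


def closure(food: Iterable[str], reactions: Dict[str, Reaction]) -> Set[str]:
    available = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= available and not products <= available:
                available |= products
                changed = True
    return available


def max_raf(reactions: Dict[str, Reaction],
            catalysis: Set[Tuple[str, str]], food: Set[str]) -> Set[str]:
    """Iterate the reduction step of (1) until no reaction is removed.
    Returns the names in the maximal RAF, or an empty set if none exists."""
    current = dict(reactions)
    while True:
        cl = closure(food, current)
        kept = {name: rxn for name, rxn in current.items()
                if rxn[0] <= cl and any((x, name) in catalysis for x in cl)}
        if len(kept) == len(current):
            return set(kept)
        current = kept


reactions = {
    "r1": (frozenset({"a", "b"}), frozenset({"ab"})),
    "r2": (frozenset({"ab", "b"}), frozenset({"abb"})),
    "r3": (frozenset({"abb", "b"}), frozenset({"abbb"})),    # catalyst z never made
    "r4": (frozenset({"abbb", "b"}), frozenset({"abbbb"})),  # depends on r3
}
catalysis = {("abb", "r1"), ("a", "r2"), ("z", "r3"), ("a", "r4")}
food = {"a", "b"}

print(sorted(max_raf(reactions, catalysis, food)))
```

Here `r3` is removed in the first round (no catalyst), and `r4` in the second (its reactant `abbb` is no longer produced), a finite analogue of the cascade of removals seen in the infinite example that follows.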

Example 2: Consider the following infinite CRS, $Q_4 = (X, \mathcal{R}, C, F)$, where
$F = \{f\}$, and $X = \{f, s, t\} \cup \{x_1, x_2, x_3, \ldots\} \cup \mathcal{F}$, where

$$\mathcal{F} = \{f, ff, fff, \ldots\}$$

(this set can be thought of as all polymers of $f$). The reaction set is $\mathcal{R} =
\{r_1, r_2, r_3, \ldots\} \cup \{r'_2, r'_3, \ldots\}$, where, writing $f^{(i)}$ for the polymer consisting
of $i$ copies of $f$:

$$r_i = (f + f^{(i)} \to f^{(i+1)}), \text{ for all } i \geq 1;$$

$$r'_i = (f^{(i)} \to x_i + s), \text{ for all } i \geq 2.$$

The pattern of catalysis is defined as follows: $s$ catalyses $r_1$ and $t$ catalyses
$r'_2$, and for all $i > 1$, $f^{(i)}$ catalyses $r_i$ and $x_i$ catalyses $r'_{i+1}$. This CRS is
illustrated in Fig. 4. Notice that $Q_4$ satisfies (A1), (A2) and (A3). However, if
we construct the sequence $\mathcal{R}_i$ described above, then as the sole catalyst ($t$) of
$r'_2$ is neither in the food set, nor generated by any other reaction, it follows
that $r'_2$ will be absent from $\mathcal{R}_2$, and so $r'_3$ will also be absent from $\mathcal{R}_3$ (since
the only catalyst of $r'_3$ is produced by $r'_2$). Continuing in this way, we obtain
$\mu(Q_4) = \{r_1, r_2, r_3, \ldots\}$, but this set is not an RAF, since the sole catalyst
$s$ of $r_1$ does not lie in the closure of $F$ relative to $\{r_1, r_2, r_3, \ldots\}$: it was
produced by the $r'$ reactions, and these have all disappeared in the limit;
moreover, it is clear that no subset of $\mathcal{R}$ is an RAF for $Q_4$. □

Fig. 4 An infinite CRS $Q_4$ which has no RAF even though $\mu(Q_4)$ is non-empty (equal to
$\{r_1, r_2, \ldots\}$). This CRS satisfies (A1)-(A3) and (A5), but not (A4).

Thus, we require slightly stronger hypotheses than just (A1)-(A3) in order
to ensure that $Q$ has an RAF when $\mu(Q) \neq \emptyset$. This is provided by the following
result.

Proposition 1 Let $Q = (X, \mathcal{R}, C, F)$ satisfy (A1) and (A2). The following
then hold:

(i) $\mu(Q)$ contains every RAF for $Q$; in particular, if $\mu(Q) = \emptyset$, then $Q$ has no
RAF.

(ii) Suppose that $Q$ satisfies both of the following further conditions:

(A4) $\bigcap_{i \geq 1} \mathrm{cl}_{\mathcal{R}_i}(F) \subseteq \mathrm{cl}_{\mu(Q)}(F)$, for the sequence $\mathcal{R}_i$ defined in (1).

(A5) Each reaction $r \in \mathcal{R}$ is catalysed by only finitely many molecule types.

Then $Q$ contains an RAF if and only if $\mu(Q)$ is non-empty (in which case,
$\mu(Q)$ is the maximal RAF for $Q$).


Before proving this result, we pause to make some comments and observations
concerning the new conditions (A4) and (A5). Regarding Condition
(A4), containment in the opposite direction is automatic (by virtue of the fact
that $f(\bigcap_i Y_i) \subseteq \bigcap_i f(Y_i)$ for any function $f$ and sets $Y_i$), so (A4) amounts to
saying that the two sets described are equal.

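Notice also that $Q_4$ in Example 2 (Fig. 4) satisfies (A5) but it violates
(A4), as it must, since $Q_4$ does not have an RAF. To see how $Q_4$ violates
(A4), notice that $\mathrm{cl}_{\mu(Q_4)}(F) = \mathcal{F}$, while $\bigcap_{i \geq 1} \mathrm{cl}_{\mathcal{R}_i}(F) = \mathcal{F} \cup \{s\}$, where
$\mathcal{F} = \{f, ff, fff, \ldots\}$ is the set of polymers of $f$.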

Condition (A5) is quite strong, but Proposition 1 is no longer true if it is
removed. To see why, consider the following modification $Q'_4$ of $Q_4$ in which
the only product of $r'_i$ (for $i > 1$) is $x_i$, and $x_i$ catalyses $r_1$ for all $i > 1$ (in
addition to $r'_{i+1}$), as shown in Fig. 5. Then $\mathrm{cl}_{\mu(Q'_4)}(F) = \bigcap_{i \geq 1} \mathrm{cl}_{\mathcal{R}_i}(F) = \mathcal{F}$,
where $\mathcal{F} = \{f, ff, fff, \ldots\}$,
so (A4) holds; however $\mu(Q'_4) = \{r_1, r_2, \ldots\}$ which, as before, is not an RAF
for $Q'_4$ since there is no catalyst of $r_1$ in $\mathrm{cl}_{\{r_1, r_2, \ldots\}}(F)$. Notice that (A5) fails
for $Q'_4$ since $r_1$ has infinitely many catalysts. Nevertheless, it is possible to
obtain a result that dispenses with (A5) at the expense of strengthening
(A4), which we will do shortly in Proposition 2.



Fig. 5 An infinite CRS $Q'_4$ which has no RAF even though $\mu(Q'_4)$ is non-empty (equal to
$\{r_1, r_2, \ldots\}$). This CRS satisfies (A1)-(A3) and (A4), but not (A5), nor (A4)'.

Proof of Proposition 1 For Part (i), suppose $\mathcal{R}'$ is any RAF for $Q$. Induction on $i \geq 1$
shows that $\mathcal{R}' \subseteq \mathcal{R}_i$ for all $i$, so that $\mathcal{R}' \subseteq \mu(Q)$; in particular, if $\mu(Q) = \emptyset$,
then $Q$ has no RAF. The proof of part (ii) of Proposition 1 relies on a simple
lemma.

Lemma 2 Suppose that $(A_i, i \geq 1)$ is any nested decreasing sequence of subsets
and $B$ is a finite set for which $A_i \cap B \neq \emptyset$ for all $i \geq 1$. Then some element
of $B$ is present in every set $A_i$.

Proof of lemma: Suppose, to the contrary, that for every element $b \in B$, there
is some set $A_{i(b)}$ in the sequence that fails to contain $b$ (we will show this is
not possible by deriving a contradiction). Let $I = \max\{i(b) : b \in B\}$. Since
$B$ is a finite set, $I$ is a finite integer, and since the sequence $(A_i, i \geq 1)$ is a
nested decreasing sequence, it follows that $A_I \cap B = \emptyset$, a contradiction. □

Returning to the proof of Part (ii), suppose that $\mu(Q) \neq \emptyset$; we will show that
$\mu(Q)$ is an RAF for $Q$ (and so, by Part (i), the unique maximal RAF for $Q$).
For $r \in \mu(Q)$, $\rho(r) \subseteq \mathrm{cl}_{\mathcal{R}_i}(F)$ for each $i$ (otherwise $r$ would not be an element of
$\mathcal{R}_{i+1}$ and thereby fail to lie in $\mu(Q)$). Thus $\rho(r) \subseteq \bigcap_{i \geq 1} \mathrm{cl}_{\mathcal{R}_i}(F) \subseteq \mathrm{cl}_{\mu(Q)}(F)$
by (A4). It remains to show that $r$ is catalysed by at least one element of
$\mathrm{cl}_{\mu(Q)}(F)$. Let $B_r = \{x \in X : (x, r) \in C\}$. By (A5), $B_r$ is finite. Moreover,
for each $i \geq 1$, $B_r \cap \mathrm{cl}_{\mathcal{R}_i}(F) \neq \emptyset$ (otherwise $r$ would fail to be in $\mathcal{R}_{i+1}$ and
thereby not lie in $\mu(Q)$). By Lemma 2 (applied to the nested decreasing sets
$\mathrm{cl}_{\mathcal{R}_i}(F)$), there is a molecule type $x \in B_r$ that
lies in $\bigcap_{i \geq 1} \mathrm{cl}_{\mathcal{R}_i}(F)$, and this latter set is contained in $\mathrm{cl}_{\mu(Q)}(F)$ by (A4). In
summary, every reaction in $\mu(Q)$ has all its reactants and at least one catalyst
present in $\mathrm{cl}_{\mu(Q)}(F)$ and so $\mu(Q)$ is an RAF for $Q$, as claimed.

□ 

Suppose we now remove Condition (A5) in Proposition 1. In this case, by
a slight strengthening of (A4), we obtain a positive result (Proposition 2). To
describe this, we first require a further definition. Recall that $C$ is the set of
pairs $(x, r)$ where molecule type $x$ catalyses reaction $r$. Given a subset $C'$ of
$C$, let

$$\mathcal{R}[C'] = \{r \in \mathcal{R} : (x, r) \in C' \text{ for some } x \in X\}.$$

Define a nested decreasing sequence of subsets $C_1, C_2, \ldots$, by $C_1 = C$ and for
each $i \geq 1$,

$$C_{i+1} = \{(x, r) \in C_i : \{x\} \cup \rho(r) \subseteq \mathrm{cl}_{\mathcal{R}[C_i]}(F)\}, \quad (2)$$

and let $C_\infty = \bigcap_{i \geq 1} C_i$.

Proposition 2 Let $Q$ satisfy (A1) and (A2), as well as the following property:

(A4)' $\bigcap_{i \geq 1} \mathrm{cl}_{\mathcal{R}[C_i]}(F) \subseteq \mathrm{cl}_{\mathcal{R}[C_\infty]}(F)$, for the sequence $C_i$ defined in (2).

Then $Q$ has an RAF if and only if $C_\infty \neq \emptyset$, in which case $\mathcal{R}[C_\infty]$ is a maximal
RAF for $Q$.



Self-sustaining autocatalytic networks within open-ended reaction systems 




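Proof Suppose that $C_\infty \neq \emptyset$. Then for any $r \in \mathcal{R}[C_\infty]$ there exists $x \in X$ such
that $(x, r) \in C_\infty$. It follows that $(x, r) \in C_i$ for all $i$. By definition, this means
that $\{x\} \cup \rho(r) \subseteq \mathrm{cl}_{\mathcal{R}[C_i]}(F)$ for all $i$, and so $\{x\} \cup \rho(r) \subseteq \bigcap_{i \geq 1} \mathrm{cl}_{\mathcal{R}[C_i]}(F)$.
Now, by (A4)', this means that $\{x\} \cup \rho(r) \subseteq \mathrm{cl}_{\mathcal{R}[C_\infty]}(F)$. In summary, every
reaction in the non-empty set $\mathcal{R}[C_\infty]$ has all its reactants and at least one
catalyst in the closure of $F$ with respect to $\mathcal{R}[C_\infty]$ and so $\mathcal{R}[C_\infty]$ forms an
RAF for $Q$.

Conversely, suppose that $Q$ contains an RAF $\mathcal{R}'$; we will show that $C_\infty \neq
\emptyset$. For each $r \in \mathcal{R}'$, select a catalyst $x_r$ for $r$ for which $x_r \in \mathrm{cl}_{\mathcal{R}'}(F)$. Let
$A = \{(x_r, r) : r \in \mathcal{R}'\}$. We use induction on $i$ to show that $A \subseteq C_i$ for all
$i \geq 1$. Clearly $A \subseteq C = C_1$, so suppose that $A \subseteq C_i$ and select an element
$(x_r, r) \in A$. By definition,

$$\{x_r\} \cup \rho(r) \subseteq \mathrm{cl}_{\mathcal{R}'}(F) = \mathrm{cl}_{\mathcal{R}[A]}(F) \subseteq \mathrm{cl}_{\mathcal{R}[C_i]}(F),$$

which means that $(x_r, r) \in C_{i+1}$, establishing the induction step. It follows
that $\emptyset \neq A \subseteq \bigcap_{i \geq 1} C_i = C_\infty$ and so $C_\infty \neq \emptyset$ as claimed.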

□ 

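Notice that, just as for condition (A4), the condition (A4)' is equivalent to
requiring that the two sets described be identical. Notice also that, although
condition (A4) applies to the CRS $Q'_4$, condition (A4)' fails, since $C_\infty =
\{(s, r_1), (ff, r_2), (fff, r_3), \ldots\}$ and so $\mathrm{cl}_{\mathcal{R}[C_\infty]}(F) = \mathcal{F} = \{f, ff, fff, \ldots\}$,
while $s \in \mathrm{cl}_{\mathcal{R}[C_i]}(F)$ for all $i \geq 1$, and so $\bigcap_{i \geq 1} \mathrm{cl}_{\mathcal{R}[C_i]}(F)$ is not a subset of
$\mathrm{cl}_{\mathcal{R}[C_\infty]}(F)$.
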
In summary, a single application of $\mu$ allows us to determine when $Q$ has
an RAF, provided the additional condition (A4)' holds. Example 2 showed
that some additional assumption of this type is required; however, one could
also consider other approaches for determining the existence of RAFs that do
not assume a further condition like (A4)', but instead iterate the map $\mu$. In
other words, consider the following ‘higher level’ sequence of subsets of $\mathcal{R}$:

$$\mathcal{R}, \mu(Q), \mu^2(\mathcal{R}), \ldots, \mu^k(\mathcal{R}), \ldots$$

where $\mu^k(\mathcal{R}) = \mu(\mu^{k-1}(\mathcal{R}))$, for each $k \geq 2$. Again, this forms a decreasing
nested sequence of subsets of $\mathcal{R}$ and so we can consider the set:

$$\nu(Q) = \bigcap_{k \geq 1} \mu^k(\mathcal{R}).$$

In the example above for $Q_4$ where $\mu(Q_4) \neq \emptyset$, notice that $\mu^2(\mathcal{R}) = \emptyset$ (and
so $\nu(Q_4) = \emptyset$). It follows from Proposition 1 that if $\mu^k(Q) = \emptyset$ for any $k \geq 1$
then $Q$ has no RAF. However, just because $\nu(Q) \neq \emptyset$, this does not imply that
$Q$ contains an RAF, as the next example shows.

Example 3: Consider the infinite CRS $Q_5 = (X, \mathcal{R}, C, F)$ which is obtained
by taking a countably infinite number of (reaction and molecule disjoint) copies
of $Q_4$ (from Example 2) and letting the molecule type $s$ in the $i$-th copy of $Q_4$
play the role of the molecule $t$ in the $(i+1)$-th copy of $Q_4$. In addition, let $r_0$
be the reaction $f + f + f \to \omega$ (where $\omega$ is an additional molecule) catalysed
by the $s$-products of all the copies of $Q_4$. Now $\mu^k(Q_5)$ contains all but the first
$k$ copies of $Q_4$, plus $r_0$. Consequently, $\nu(Q_5) = \{r_0\}$ but, as before, this is not
an RAF. Notice, however, that this example violates condition (A3). □


3 Finite RAFs in systems satisfying (A1)-(A3)

We have seen from the last section that applying $\mu$, even infinitely often,
does not seem to provide a way to determine whether a CRS possesses an
RAF. However, in most applications, the main interest will generally be in
finite RAFs. From the earlier theory it is clear that if $\mu^k(Q)$ is finite for some
integer $k \geq 1$ then any RAFs that may exist for $Q$ are necessarily finite, and
finite in number. Moreover, if

$$\emptyset \neq \mu^k(Q) = \mu^{k+1}(Q), \text{ for some } k \geq 1,$$

and this set is finite, then $\mu^k(Q)$ is the unique (and necessarily finite) maximal
RAF for $Q$. However, it is also quite possible that a CRS might contain both
finite and infinite RAFs, and in this section we describe a characterisation of
when a CRS contains a finite RAF.

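Given a CRS $Q$, define a sequence $\mathcal{R}'_1, \mathcal{R}'_2, \ldots$ of subsets of $\mathcal{R}$ as follows:

$$\mathcal{R}'_1 = \{r \in \mathcal{R} : \rho(r) \subseteq F\}, \text{ and}$$

$$\mathcal{R}'_i = \{r \in \mathcal{R} : \rho(r) \subseteq F \cup \bigcup_{1 \leq j < i} \pi(\mathcal{R}'_j)\}, \text{ for each } i > 1,$$

where $\pi(\mathcal{R}'_j)$ denotes the set of products of the reactions in $\mathcal{R}'_j$.
In words, $\mathcal{R}'_1$ is the set of reactions that have all their reactants in $F$, and for
$i > 1$, $\mathcal{R}'_i$ is the set of reactions for which each reactant is either an element of
$F$ or a product of some reaction in $\mathcal{R}'_j$ for $j < i$.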

Proposition 3 Suppose a CRS $Q$ satisfies (A1)-(A3). Let $Q'_i = (X, \mathcal{R}'_i, C, F)$
for all $i \geq 1$, where $\mathcal{R}'_i$ is as defined above. Then:

(i) $(\mathcal{R}'_i : i \geq 1)$ is a nested increasing sequence of finite sets.

(ii) $Q$ has a finite RAF if and only if $\mu(Q'_i) \neq \emptyset$ for some $i \geq 1$.

(iii) If $\mu(Q'_i) \neq \emptyset$ for some $i$, then $\mu(Q'_j)$ is a finite RAF for $Q$ for all $j \geq i$.

(iv) Every finite RAF for $Q$ is contained in $\mu(Q'_j)$ for some $j \geq 1$.

Proof By (A1) and (A3), it follows that $\mathcal{R}'_1$ is finite, and, by induction (using
(A2)), that $\mathcal{R}'_i$ is finite for all $i \geq 1$. Moreover, if $r \in \mathcal{R}'_i$ then $\rho(r) \subseteq F \cup \bigcup_{1 \leq j < i} \pi(\mathcal{R}'_j)$
and so $\rho(r) \subseteq F \cup \bigcup_{1 \leq j < i+1} \pi(\mathcal{R}'_j)$ (i.e. $r \in \mathcal{R}'_{i+1}$) and so the sets $\mathcal{R}'_i$, $i \geq 1$,
form an increasing nested sequence. This establishes (i). For Parts (ii) and
(iii), suppose that $Q$ contains a finite RAF $\mathcal{R}'$. Since (A1) and (A2) hold, we
can apply Lemma 1 to deduce that every reaction $r \in \mathcal{R}'$ is an element of $\mathcal{R}'_i$
for some $i$. Thus, since $\mathcal{R}'$ is finite, and the sequence $\mathcal{R}'_i$ is a nested increasing
sequence of finite sets, it follows that $\mathcal{R}' \subseteq \mathcal{R}'_k$ for some fixed $k$, in which case
$\mu(Q'_k) \neq \emptyset$. Conversely, if $\mu(Q'_i) \neq \emptyset$, then it is clear from the definitions that
$\mu(Q'_i)$ is a finite RAF for $Q$; moreover, so also is $\mu(Q'_j)$ for all $j \geq i$. Part
(iv) also follows easily from the definitions, since if $\mathcal{R}'$ is a finite RAF for $Q$
then $\mathcal{R}' \subseteq \mathcal{R}'_j$ for some $j \geq 1$, and since $\mathcal{R}'$ is finite we have $\mu(\mathcal{R}') = \mathcal{R}'$ and
so $\mathcal{R}' = \mu(\mathcal{R}') \subseteq \mu(Q'_j)$. This completes the proof.

□ 

Proposition 3 provides an algorithm to search for finite RAFs in any infinite CRS
that satisfies (A1)-(A3). Given $Q$, construct $\mathcal{R}'_1$ and run the (standard) RAF
algorithm [5] and [6] on $\mathcal{R}'_1$. If it fails to find an RAF, then construct $\mathcal{R}'_2$ and
run the algorithm on this set, and continue in the same manner. If $Q$ contains
a finite RAF, then this process is guaranteed to find it; however, there is no
assurance in advance of how long this might take (if no constraint is placed
on how large the smallest finite RAF might be).
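
The level-by-level search can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the infinite CRS is assumed to be given lazily through two hypothetical callbacks, `reactions_within` (all reactions whose reactants lie in a given finite set, finite by (A1)-(A3)) and `catalyses` (a predicate on molecule and reaction name), and the search is cut off at `max_levels` because no bound on the running time is known in advance. The toy polymer system over a single letter `a` is invented for the example.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional, Set, Tuple

Reaction = Tuple[FrozenSet[str], FrozenSet[str]]  # (reactants, products)


def closure(food: Iterable[str], reactions: Dict[str, Reaction]) -> Set[str]:
    available = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= available and not products <= available:
                available |= products
                changed = True
    return available


def max_raf(reactions, catalyses, food) -> Set[str]:
    """Standard reduction on a finite reaction set; empty set if no RAF."""
    current = dict(reactions)
    while True:
        cl = closure(food, current)
        kept = {n: rx for n, rx in current.items()
                if rx[0] <= cl and any(catalyses(x, n) for x in cl)}
        if len(kept) == len(current):
            return set(kept)
        current = kept


def find_finite_raf(food: Set[str],
                    reactions_within: Callable[[FrozenSet[str]], Dict[str, Reaction]],
                    catalyses: Callable[[str, str], bool],
                    max_levels: int) -> Optional[Tuple[int, Set[str]]]:
    available = set(food)
    for level in range(1, max_levels + 1):
        level_reactions = reactions_within(frozenset(available))  # level-i set
        raf = max_raf(level_reactions, catalyses, food)
        if raf:
            return level, raf
        for _reactants, products in level_reactions.values():
            available |= products
    return None  # no finite RAF found within the cut-off


# Toy system: ext_n = (a + a^n -> a^(n+1)) for every n >= 1.
def reactions_within(available: FrozenSet[str]) -> Dict[str, Reaction]:
    if "a" not in available:
        return {}
    return {f"ext_{len(m)}": (frozenset({"a", m}), frozenset({m + "a"}))
            for m in available}


def catalyses(x: str, name: str) -> bool:
    n = int(name.split("_")[1])
    return x == "aaaa" if n <= 3 else x == "a" * (n + 2)


print(find_finite_raf({"a"}, reactions_within, catalyses, max_levels=10))
```

In this toy system the first two levels contain no RAF, because the catalyst `aaaa` is not yet reachable; the third level yields the finite RAF consisting of `ext_1`, `ext_2` and `ext_3`.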


4 General setting 

Finally, we show how Proposition 2 can be reformulated more abstractly in
order to make clear the underlying mathematical principles; the added generality
may also be useful for settings beyond chemical reaction systems. This
uses the notion of “gf-compatibility” from [3], which we now explain.

Suppose we have an arbitrary set $Y$ and an arbitrary partially ordered set
$W$, together with some functions $f : 2^Y \to W$ and $g : Y \to W$. Consider the
function $\psi : 2^Y \to 2^Y$, where

$$\psi(A) := \{y \in A : g(y) \leq f(A)\}.$$

We are interested in the non-empty subsets of $Y$ that are fixed points of $\psi$, particularly
when $f$ is monotonic (i.e., where $A \subseteq B \implies f(A) \leq f(B)$). A subset $A$ of $Y$ is
said to be gf-compatible if $A$ is non-empty and $\psi(A) = A$.

The notion of an RAF can be captured in this general setting as follows.
Given a CRS $Q = (X, \mathcal{R}, C, F)$ satisfying (A2), take $Y = C$ and $W = 2^X$
(partially ordered by set inclusion), and define $f : 2^Y \to W$ and $g : Y \to W$
as follows:

$$f(A) = \mathrm{cl}_{\mathcal{R}[A]}(F) \text{ and } g((x, r)) = \{x\} \cup \rho(r), \quad (3)$$

where, as earlier, $\mathcal{R}[A]$ is the set of reactions $r \in \mathcal{R}$ for which there is some
$x' \in X$ with $(x', r) \in A$. Notice that $f$ is monotonic and when $Q$ is finite, the
set $f(A)$ can be computed in polynomial time in the size of $Q$.

Lemma 3 Suppose we have a CRS $Q$ satisfying (A2), and with $f$ and $g$ defined
as in (3). If $A$ is gf-compatible, then $\mathcal{R}[A]$ is an RAF for $Q$. Conversely,
if $\mathcal{R}'$ is an RAF for $Q$, then a gf-compatible set $A$ exists with $\mathcal{R}[A] = \mathcal{R}'$. In
particular, $Q$ has an RAF if and only if $Y$ contains a gf-compatible set.

Proof If $A$ is a gf-compatible subset of $Y$, then for $\mathcal{R}' = \mathcal{R}[A]$, each reaction $r \in
\mathcal{R}'$ has at least one molecule type $x \in X$ for which $(x, r) \in A$. gf-compatibility
ensures that $g((x, r)) \subseteq f(A)$, in other words, $\{x\} \cup \rho(r) \subseteq \mathrm{cl}_{\mathcal{R}'}(F)$ for some
catalyst $x$ of $r$. This holds for every $r \in \mathcal{R}'$, so $\mathcal{R}'$ is an RAF for $Q$. Conversely,
if $\mathcal{R}'$ is an RAF, then for each reaction $r \in \mathcal{R}'$, we can choose an associated
catalyst $x_r$ so that $\{x_r\} \cup \rho(r) \subseteq \mathrm{cl}_{\mathcal{R}'}(F)$. Then $A = \{(x_r, r) : r \in \mathcal{R}'\}$ is a
gf-compatible subset of $Y$, with $\mathcal{R}[A] = \mathcal{R}'$. □

The problem of finding a gf-compatible set (if one exists) in a general
setting (arbitrary $Y$ and $W$, not necessarily related to chemical reaction networks)
can be solved in polynomial time when $Y$ is finite and $f$ is
monotonic and computable in polynomial time (iterating $\psi$ from $Y$ reaches a
fixed point after at most $|Y|$ steps). This provides a natural generalization
of the classical RAF algorithm. In [5], we showed how other problems
(including a toy problem in economics) could be formulated within this more
general framework.
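
For finite $Y$, the iteration of $\psi$ can be written generically. The sketch below is illustrative only: `largest_compatible` and its parameter `leq` (the partial order on $W$) are names chosen here, and the RAF instance uses the interpretation of $f$ and $g$ in (3) on a small hypothetical CRS.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Set, Tuple


def largest_compatible(Y: Iterable[Hashable], f: Callable, g: Callable,
                       leq: Callable[[object, object], bool]) -> FrozenSet:
    """Iterate psi(A) = {y in A : g(y) <= f(A)} from A = Y until it is fixed.
    For finite Y and monotonic f the result contains every gf-compatible
    subset; an empty result means that no gf-compatible subset exists."""
    A = frozenset(Y)
    while True:
        bound = f(A)
        smaller = frozenset(y for y in A if leq(g(y), bound))
        if smaller == A:
            return A
        A = smaller


# --- RAF interpretation: Y = C, W = subsets of X ordered by inclusion ---
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]  # (reactants, products)
reactions: Dict[str, Reaction] = {
    "r1": (frozenset({"a", "b"}), frozenset({"ab"})),
    "r2": (frozenset({"ab", "b"}), frozenset({"abb"})),
    "r3": (frozenset({"abb", "b"}), frozenset({"abbb"})),
}
food = frozenset({"a", "b"})
C = {("abb", "r1"), ("a", "r2"), ("z", "r3")}  # z is never produced


def closure(start: Iterable[str], rxns: Dict[str, Reaction]) -> Set[str]:
    available = set(start)
    changed = True
    while changed:
        changed = False
        for reactants, products in rxns.values():
            if reactants <= available and not products <= available:
                available |= products
                changed = True
    return available


def f(A) -> Set[str]:
    names = {r for (_x, r) in A}  # the reactions that have a catalyst pair in A
    return closure(food, {n: reactions[n] for n in names})


def g(pair) -> Set[str]:
    x, r = pair
    return {x} | reactions[r][0]


A = largest_compatible(C, f, g, lambda u, v: u <= v)
print(sorted(A))
```

The pair `("z", "r3")` is discarded in the first step, and the remaining two pairs form a fixed point whose reactions `r1` and `r2` constitute an RAF.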

However, if we allow the set $Y$ to be infinite, then monotonicity of $f$ needs
to be supplemented with a further condition on $f$. We will consider a condition
(‘$\omega$-continuity’), which generalizes (A4)', and that applies automatically when
$Y$ is finite. We say that $f : 2^Y \to W$ is (weakly) $\omega$-continuous if, for any nested
descending chain $A_i$, $i \geq 1$, of sets, we have:

$$f\Bigl(\bigcap_{i \geq 1} A_i\Bigr) \text{ is a greatest lower bound for } \{f(A_i), i \geq 1\}. \quad (4)$$

Recall that a subset of a partially ordered set need not have a greatest lower
bound (glb); but if it does, it has a unique one. Notice that when $Y$ is finite,
this property holds trivially, since then $f(\bigcap_{i \geq 1} A_i) = f(A_n)$ for the last set
$A_n$ in the (finite) nested chain.

For a subset $A$ of $Y$ and $k \ge 1$, define $\psi^{(k)}(A)$ to be the result of applying
the function $\psi$ iteratively $k$ times starting with $A$. Thus $\psi^{(1)}(A) = \psi(A)$ and, for
$k \ge 1$, $\psi^{(k+1)}(A) = \psi(\psi^{(k)}(A))$. Taking the particular interpretation of $f$ and
$g$ in (3), the sequence $\psi^{(k)}(Y)$ is nothing more than the sequence $C_k$ from (2).

Notice that the sequence $(\psi^{(k)}(A), k \ge 1)$ is a nested decreasing sequence
of subsets of $Y$, and so we may define the set:

$$\overline{\psi}(A) := \lim_{k \to \infty} \psi^{(k)}(A) = \bigcap_{k \ge 1} \psi^{(k)}(A),$$

which is a (possibly empty) subset of $Y$ (in the setting of Proposition [?], $\overline{\psi}(A) = C_\infty$).

Given (finite or infinite) sets $Y$, $W$, where $W$ is partially ordered, together
with functions $f : 2^Y \to W$ and $g : Y \to W$, it is routine to verify that the
following properties hold:

(i) The $gf$-compatible subsets of $Y$ are precisely the non-empty subsets of $Y$
that are fixed points of $\psi$;

(ii) If $f$ is monotonic then $\overline{\psi}(Y)$ contains all $gf$-compatible subsets of $Y$; in
particular, if $\overline{\psi}(Y) = \emptyset$, then there is no $gf$-compatible subset of $Y$.

(iii) If $f$ is $\omega$-continuous then $\overline{\psi}(Y)$ is $gf$-compatible, provided it is non-empty;
in particular, if $f$ is monotonic and $\omega$-continuous then (by (ii)) a
$gf$-compatible subset of $Y$ exists if and only if $\overline{\psi}(Y)$ is non-empty.

(iv) Without the assumption that $f$ is weakly $\omega$-continuous in Part (iii), it is
possible for $\overline{\psi}(Y)$ to fail to be $gf$-compatible when $Y$ is infinite, even if $f$
is monotonic.
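
When $Y$ is finite, these properties suggest a simple procedure: apply $\psi$ repeatedly, starting from $Y$, until the set stops changing. The Python sketch below is a minimal illustration of that idea, not the paper's own implementation. It assumes $\psi(A) = \{y \in A : g(y) \le f(A)\}$ (the definition used in the proof of (5)); the names `psi`, `psi_bar` and `leq`, and the toy choice $f(A) = |A|$ with integer-valued $g$, are hypothetical.

```python
import operator
from typing import Callable, FrozenSet, Hashable, Iterable


def psi(A: FrozenSet[Hashable], f: Callable, g: Callable, leq: Callable) -> FrozenSet[Hashable]:
    """One application of psi: keep the y in A with g(y) <= f(A)."""
    f_A = f(A)
    return frozenset(y for y in A if leq(g(y), f_A))


def psi_bar(Y: Iterable[Hashable], f: Callable, g: Callable, leq: Callable) -> FrozenSet[Hashable]:
    """Iterate psi from a FINITE set Y until a fixed point is reached.
    Each pass either removes an element or stops, so the loop runs at
    most |Y| + 1 times."""
    A = frozenset(Y)
    while True:
        B = psi(A, f, g, leq)
        if B == A:
            return A
        A = B


# Toy instance (made up): W is the integers with the usual order,
# f(A) = |A| is monotonic, and g(y) is the "support" that y requires.
need = {"a": 1, "b": 2, "c": 4, "d": 5}
result = psi_bar(need, len, need.__getitem__, operator.le)
print(sorted(result))  # ['a', 'b']

# If no non-empty fixed point exists, the result is the empty set.
need_none = {"a": 2}
print(psi_bar(need_none, len, need_none.__getitem__, operator.le))  # frozenset()
```

In the first toy instance the iteration discards `d`, then `c`, and stops at `{a, b}`, which is a non-empty fixed point of $\psi$ and hence $gf$-compatible by (i). For infinite $Y$ no such termination guarantee is available.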

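The proof of Parts (i)-(iii) proceeds exactly as in [3], with the addition of
one extra step required to justify Part (iii), assuming $\omega$-continuity. Namely,
Condition (4) ensures that $\psi : 2^Y \to 2^Y$ is also $\omega$-continuous in the sense that
for any nested descending chain $A_i$, $i \ge 1$ of sets, we have:

$$\psi\Big(\bigcap_{i \ge 1} A_i\Big) = \bigcap_{i \ge 1} \psi(A_i), \qquad (5)$$

and so $\psi(\overline{\psi}(Y)) = \overline{\psi}(Y)$. The proof of (5) from (4) is straightforward: firstly,
the inclusion $\subseteq$ holds because, by (4), $f(\bigcap_{i \ge 1} A_i)$ is a lower bound for
$\{f(A_i) : i \ge 1\}$, so that $g(y) \le f(\bigcap_{i \ge 1} A_i)$ implies $g(y) \le f(A_i)$ for every $i$.
Conversely, if $y \in \bigcap_{i \ge 1} \psi(A_i)$, then, by definition of $\psi$,
$y \in A_i$ and $g(y) \le f(A_i)$ for all $i \ge 1$, and so $y \in \bigcap_{i \ge 1} A_i$ and $g(y)$ is a
lower bound for $\{f(A_i) : i \ge 1\}$. Now, since $w = f(\bigcap_{i \ge 1} A_i)$ is a glb of
$\{f(A_i) : i \ge 1\}$, we have $g(y) \le w$ (i.e. $g(y) \le f(\bigcap_{i \ge 1} A_i)$) and so
$y \in \psi(\bigcap_{i \ge 1} A_i)$. Part (iv) follows directly from Parts (ii) and (iii).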

For Part (iv), consider the infinite CRS $\mathcal{Q}_4$ in Example 2. As above, take
$Y = C$, $W = 2^X$, with $f$ (on $2^Y$) and $g$ defined as in (3). Then
$\overline{\psi}(Y) = A$, where $A = \{(s, r_1), (ff, r_2), (fff, r_3), \ldots\}$; however, $A$ is not
$gf$-compatible, since $(s, r_1) \in A$ and $g((s, r_1)) = \{s, f\}$, but this is not a subset
of $f(A) = \mathrm{cl}_{\mathcal{R}[A]}(F) = T$, since $s \notin T$. In this example, $f$ fails to be weakly
$\omega$-continuous, and the argument is analogous to where we showed earlier that
$\mathcal{Q}_4$ fails to satisfy (A4)'. More precisely, for each $i \ge 1$, let
$A_i = \{(c_r, r) : r \in \mathcal{R}_i\}$, where $\mathcal{R}_i$ is as defined earlier and where, for each
reaction $r$ of $\mathcal{Q}_4$, $c_r$ is the unique catalyst of $r$. Then
$f(A_i) = T \cup \{s\} \cup \{x_{i+1}, x_{i+2}, \ldots\}$ and so $\bigcap_{i \ge 1} f(A_i) = T \cup \{s\}$.
However, $\bigcap_{i \ge 1} A_i = A$ and so $f(\bigcap_{i \ge 1} A_i) = f(A) = T$,
which differs from the glb of $\{f(A_i), i \ge 1\}$, namely $\bigcap_{i \ge 1} f(A_i) = T \cup \{s\}$. □


5 Concluding comments 

The examples in this paper are particularly simple; indeed, we mostly took the
food set to consist of just a single molecule, and reactions often had only one
possible catalyst. More ‘realistic’ examples can be constructed, based
on polymer models over an alphabet; however, the details of those examples
tend to obscure the underlying principles, so we have kept to our somewhat
‘toy’ examples so that the reader can readily verify certain statements.

Section 3 describes a process for determining whether an arbitrary infinite
CRS (satisfying (A1)-(A3)) contains a finite RAF. However, from an algorithmic
point of view, Proposition 3 is somewhat limited, since the process
described is not guaranteed to terminate in any given number of steps. If no
further restriction is placed on the (infinite) CRS, then it would seem difficult
to hope for any sort of meaningful algorithm; however, if the CRS has a ‘finite
description’ (as do our main examples above), then the question of the
algorithmic decidability of the existence of an RAF, or of a finite RAF, arises. More
precisely, suppose an infinite CRS $\mathcal{Q} = (X, \mathcal{R}, C, F)$ consists of (i) a countable
set of molecule types $X = \{x_1, x_2, \ldots\}$, where we may assume (in line with
(A1)) that $F = \{x_i : i \le K\}$ for some finite value $K$, and (ii) a countable
set $\mathcal{R} = \{r_1, r_2, \ldots\}$ of reactions, where $r_i$ has a finite set $\alpha(i)$ of reactants, a
finite set $\beta(i)$ of products, and a finite or countable set $\gamma(i)$ of catalysts, where
$\alpha$, $\beta$ and $\gamma$ are computable (i.e. partial recursive) set-valued functions defined
on the positive integers. Given this setting, a possible question for further
investigation is whether (and under what conditions) there exists an algorithm
to determine whether or not $\mathcal{Q}$ contains an RAF, or more specifically a finite
RAF (i.e. when is this question decidable?).


6 Acknowledgements 

The author thanks the Allan Wilson Centre for funding support, and Wim
Hordijk for some useful comments on an earlier version of this manuscript.
I also thank Marco Stenico (personal communication) for pointing out that
$\omega$-continuity is required for Part (iii) of the $gf$-compatibility result above
when $Y$ is infinite, and for a reference to a related fixed-point result in domain
theory (Theorem 2.3 in [13]), from which this result can also be derived.


References 

1. P. Dittrich, P. Speroni di Fenizio, Chemical organisation theory. Bull. Math. Biol. 69, 
1199-1231 (2007) 

2. P. G. Higgs, N. Lehman, The RNA World: molecular cooperation at the origins of life. 
Nat. Rev. Genet. 16(1), 7-17 (2015) 

3. W. Hordijk, M. Steel, Autocatalytic sets extended: dynamics, inhibition, and a
generalization. J. Syst. Chem. 3:5 (2012)

4. W. Hordijk, M. Steel, Autocatalytic sets and boundaries. J. Syst. Chem. (in press) (2014) 

5. W. Hordijk, M. Steel, Detecting autocatalytic, self-sustaining sets in chemical reaction
systems. J. Theor. Biol. 227(4), 451-461 (2004)

6. W. Hordijk, S. Kauffman, M. Steel, Required levels of catalysis for the emergence of
autocatalytic sets in models of chemical reaction systems. Int. J. Mol. Sci. (Special issue: 
Origin of Life 2011) 12, 3085-3101 (2011) 

7. W. Hordijk, M. Steel, S. Kauffman, The structure of autocatalytic sets: evolvability, 
enablement, and emergence. Acta Biotheor. 60, 379-392 (2012) 

8. S. A. Kauffman, Autocatalytic sets of proteins. J. Theor. Biol. 119, 1-24 (1986)

9. S. A. Kauffman, The Origins of Order (Oxford University Press, Oxford 1993) 

10. E. Mossel, M. Steel, Random biochemical networks and the probability of self-sustaining 
autocatalysis. J. Theor. Biol. 233(3), 327-336 (2005) 

11. J. Smith, M. Steel, W. Hordijk, Autocatalytic sets in a partitioned biochemical network. 
J. Syst. Chem. 5:2 (2014) 

12. M. Steel, W. Hordijk, J. Smith, Minimal autocatalytic networks. J. Theor. Biol.
332, 96-107 (2013)

13. V. Stoltenberg-Hansen, I. Lindström, E. R. Griffor, Mathematical Theory of Domains.
Cambridge Tracts in Theoretical Computer Science 22 (Cambridge University Press,
Cambridge 1994)



14. V. Vasas, C. Fernando, M. Santos, S. Kauffman, E. Szathmáry, Evolution before genes.
Biol. Dir. 7:1 (2012) 

15. M. Villani, A. Filisetti, A. Graudenzi, C. Damiani, T. Carletti, R. Serra, Growth and
division in a dynamic protocell model. Life 4, 837-864 (2014) 



